Felix Colibri- Cooking the Code Utility


Cooking The Code - Felix John COLIBRI.

abstract : a source code filtering utility which removes unwanted rows, areas and comments. Complete with Unit Test.

key words : code transformation - code filtering - mark up - tStringList handling - unit test

software used : Windows XP Home, Delphi 6

hardware used : Pentium 2,800 MHz, 512 MB memory, 140 GB hard disk

scope : Delphi 1 to 2006, Turbo Delphi for Windows, Kylix
Delphi 5, 6, 7, 8 Delphi 2005, 2006, Turbo Delphi, Turbo 2007, Rad Studio 2007, Rad Studio 2009

level : Delphi developer

plan :

Why Source Code Filtering ?

The Delphi Code Cooker

Unit Test to Test the Code Cooker

Comments and Improvements

Download the Source Code

1 - Why Source Code Filtering ?

For most of our customers, we provide the source code of our contracted projects. For many of our customer applications, we use a developer version that includes parts which are of no interest to the customer :

trial code
developer comments (todo list, add this, contact such and such for review, who corrected this little bit, see also some other project)
debug logging which could be more voluminous than usage logging

Before delivering the source, we could manually remove the unwanted parts, but, as with other "dual code" endeavours, this process is error prone and tedious:

we could miss some parts
we could remove too many lines
maintaining two versions of the code over time can become a nightmare

Therefore we use a small utility which removes the unwanted parts. This process is performed using simple innocuous comment markers and a companion filtering parser.

2 - The Delphi Code Cooker

2.1 - The Objective and the Grammar

The easiest way to present the rules is with "before / after" examples taken from a small piece of code.

Let's assume that we want to remove

isolated rows or groups of rows
comments

To do so, we add markers, with the following effect (on the left the original, on the right, the filtered text):

to remove a row, we append "//-" at the end of the row

 
unit _cooker_test;            |  unit _cooker_test; 
  interface                   |    interface 
  implementation              |    implementation 
    BBB                       |      BBB 
    aaa //-                   | 
    CCC                       |      CCC

to remove row groups

"//>" and "//<"

either surrounding those lines

 
CCC                       |      CCC 
//>                       | 
bbb                       | 
ccc                       | 
//<                       | 
DDD                       |      DDD

or at the end of the first and the last line

 
DDD                       |      DDD 
ddd //>                   | 
eee                       | 
fff //<                   | 
EEE                       |      EEE

for the other types of comments

first the "//" comments:

the special "// --- " is used to keep a comment

 
EEE                       |      EEE 
// --- FFF                |      // --- FFF 
GGG                       |      GGG

all other start of line "//" comments will be removed

 
GGG                       |      GGG 
//                        | 
// ggg                    | 
// --  hhh                | 
HHH                       |      HHH

the end of line // comments are removed, except those after BEGIN and END;

 
HHH                       |      HHH 
III // --  jjj            |      III 
BEGIN //  JJJ             |      BEGIN // JJJ 
END; //  KKK              |      END; // KKK 
LLL                       |      LLL

for the "(*" comments

the comments starting with (* on a line by itself will be removed

 
JJJ                       |      JJJ 
(*                        | 
jjj                       | 
kkk *)                    | 
(*                        | 
ppp *) QQQ                |      QQQ

other "(*" comments will remain

 
(* KKK                    |      (* KKK 
LLL *)                    |      LLL*) 
(*$R+*)                   |      (*$R+*) 
MMM (* NNN                |      MMM (* NNN 
OOO *) PPP                |      OOO *) PPP

the rules are the same for "{"

the comments starting with { on a line by itself will be removed

 
JJJ                       |      JJJ 
{                         | 
jjj                       | 
kkk }                     | 
{                         | 
ppp } QQQ                 |      QQQ

other "{" comments will remain

 
{ KKK                     |      { KKK 
LLL }                     |      LLL} 
{$R+}                     |      {$R+} 
MMM { NNN                 |      MMM { NNN 
OOO } PPP                 |      OOO } PPP

Please note that

the rules are somewhat dependent on our coding habits. For instance we use
- "//" comments for explanations, with two dashes, "// -- ", to make them stand out from the code
  
  // -- algorithm
  x:= 5;
- end of line "//" comments to record previous values:
  
  If line> 150 // 10
  Then
- "(*" comments for contiguous block elimination (from the compilation). Those blocks can contain "//" comments. In addition, when we exclude a block from compilation, we place the "(*" and "*)" at the margin, each on a separate line :
  
      x:= 5;
  (*
      // -- this is an explanation comment
      y:= 8;
  *)
      a:= b;
- "{" for non contiguous block elimination. So "{" comments can enclose several "(*" blocks
  
      a:= b;
  {
      c:= d;
  (*
      e:= f;
  *)
      g:= h;
  (*
      i:= j;
  *)
  }
      k:= l;
this explains why we chose
- the "//>", "//<", "//-" and "// --- " as code cooking markers, since we never use those in our usual code
- the "(*" and "{" on single lines to remove blocks. If we decide to keep some commented out block in the cooked code, we simply add any character after the "(*" or "{" (like "(*+" for instance)
for commented out code, we use "(*" on a single line, and also the matching "*)" on a line by itself. But to stay consistent with the compiler, we accept that the matching "*)" be placed anywhere in a line

    x:= 5;
(*
    y:= 8;
*)
    a:= b;
(*
    d:= 9; *) e:= 18;
    f:= 15;
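
As an illustration of the "//-", "//>" and "//<" rules above, here is a minimal sketch written independently of the downloadable unit. The names f_ends_with and remove_marked_rows are hypothetical, and the sketch assumes that a marker is recognized only at the end of a line (after trimming trailing blanks), which covers both the "marker on its own line" and the "marker at the end of the line" forms.

```pascal
program marker_rules_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses SysUtils, Classes;

// -- True if the line, ignoring trailing blanks, ends with p_marker
function f_ends_with(const p_line, p_marker: String): Boolean;
  var l_trimmed: String;
  begin
    l_trimmed:= TrimRight(p_line);
    Result:= (Length(l_trimmed)>= Length(p_marker))
        and (Copy(l_trimmed, Length(l_trimmed)- Length(p_marker)+ 1,
            Length(p_marker))= p_marker);
  end;

// -- hypothetical helper: copies the unmarked lines to p_c_result
procedure remove_marked_rows(p_c_original, p_c_result: tStrings);
  var l_index: Integer;
      l_in_group: Boolean;
      l_line: String;
  begin
    p_c_result.Clear;
    l_in_group:= False;
    for l_index:= 0 to p_c_original.Count- 1 do
    begin
      l_line:= p_c_original[l_index];
      if l_in_group
        then begin
            // -- inside "//>" .. "//<": drop the line, closing line included
            if f_ends_with(l_line, '//<')
              then l_in_group:= False;
          end
        else
          if f_ends_with(l_line, '//>')
            then l_in_group:= True
            else
              if not f_ends_with(l_line, '//-')
                then p_c_result.Add(l_line);
    end;
  end;

var g_c_original, g_c_result: tStringList;
    g_index: Integer;
begin
  g_c_original:= tStringList.Create;
  g_c_result:= tStringList.Create;
  try
    g_c_original.Add('BBB');
    g_c_original.Add('aaa //-');
    g_c_original.Add('CCC');
    g_c_original.Add('//>');
    g_c_original.Add('bbb');
    g_c_original.Add('//<');
    g_c_original.Add('DDD');
    g_c_original.Add('ddd //>');
    g_c_original.Add('eee');
    g_c_original.Add('fff //<');
    g_c_original.Add('EEE');

    remove_marked_rows(g_c_original, g_c_result);

    // -- only BBB, CCC, DDD and EEE are kept
    for g_index:= 0 to g_c_result.Count- 1 do
      WriteLn(g_c_result[g_index]);
  finally
    g_c_result.Free;
    g_c_original.Free;
  end;
end.
```

This single pass does not handle the comment rules; it only shows how little state (one Boolean) the row and row group markers require.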

2.2 - The Delphi Code

Basically we load the text into a tStringList and analyze it line by line, looking for the different markers using

Copy for start of line extraction
Pos for the other lookups
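
To make these lookups concrete, here is a minimal sketch of the standalone "(*" and "{" rule of section 2.1. It is not the code of the downloadable unit: remove_standalone_comments is a hypothetical name, and the sketch assumes that whatever follows the closing "*)" or "}" on its line is kept as a line of its own when it is not blank.

```pascal
program standalone_comment_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses SysUtils, Classes;

// -- hypothetical helper: drops the comments opened by a lone "(*" or "{"
procedure remove_standalone_comments(p_c_original, p_c_result: tStrings);
  var l_index, l_close_position: Integer;
      l_closer, l_trimmed, l_rest: String;
  begin
    p_c_result.Clear;
    // -- '' means "not inside a removed comment"
    l_closer:= '';
    for l_index:= 0 to p_c_original.Count- 1 do
      if l_closer= ''
        then begin
            l_trimmed:= Trim(p_c_original[l_index]);
            if l_trimmed= '(*'
              then l_closer:= '*)'
              else
                if l_trimmed= '{'
                  then l_closer:= '}'
                  else p_c_result.Add(p_c_original[l_index]);
          end
        else begin
            // -- Pos: the closer may be anywhere in the line
            l_close_position:= Pos(l_closer, p_c_original[l_index]);
            if l_close_position> 0
              then begin
                  l_rest:= Copy(p_c_original[l_index],
                      l_close_position+ Length(l_closer), MaxInt);
                  if Trim(l_rest)<> ''
                    then p_c_result.Add(l_rest);
                  l_closer:= '';
                end;
          end;
  end;

var g_c_original, g_c_result: tStringList;
    g_index: Integer;
begin
  g_c_original:= tStringList.Create;
  g_c_result:= tStringList.Create;
  try
    g_c_original.Add('JJJ');
    g_c_original.Add('{');
    g_c_original.Add('jjj');
    g_c_original.Add('kkk }');
    g_c_original.Add('(*');
    g_c_original.Add('ppp *) QQQ');
    g_c_original.Add('{$R+}');

    remove_standalone_comments(g_c_original, g_c_result);

    for g_index:= 0 to g_c_result.Count- 1 do
      WriteLn(g_c_result[g_index]);
  finally
    g_c_result.Free;
    g_c_original.Free;
  end;
end.
```

Because a "{" opener only looks for "}", a "{" comment can enclose several "(*" blocks, as described in section 2.1. Lines such as "{$R+}" or "(* KKK" are not standalone openers and pass through unchanged.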

2.2.1 - The Class definition

The worker Class is very simple

Type c_remove_marked_up_text=
         Class(c_basic_object)
           m_c_original_list: tStringList;
           m_c_result_list: tStringList;

           Constructor create_remove_marked_up_text(p_name: String);
           Procedure remove_marked_up_text;
           Destructor Destroy; Override;
         End; // c_remove_marked_up_text

2.2.2 - The main filtering loop

The main loop is also very simple

we read each line
we call Functions, each of which tests one of the marker rules, and
- decides how to transform the line (mainly only a keep or throw away choice)
- returns True if the line has been handled in this Function

Here is the main loop:

Procedure c_remove_marked_up_text.remove_marked_up_text;
  Var l_list_index: Integer;
      l_the_line, l_trimmed_line: String;

  Function f_end_of_file: Boolean;
    // -- ooo

  Procedure read_line;
    // -- ooo

  Procedure add_result_line;
    // -- ooo

  Function f_remove_start_of_line_slash_comments: Boolean;
    // -- ooo

  Begin // remove_marked_up_text
    m_c_result_list.Clear;

    l_list_index:= 0;
    read_line;

    While Not f_end_of_file Do
    Begin
      If f_remove_start_of_line_slash_comments Then Else
      If f_remove_parenthesis_star_comment Or f_remove_brace_comment Then Else
      If f_remove_ds_ending_slash_comments Then Else
      If f_remove_ds_endig_slash_minus_comments Then Else
      If f_remove_ds_middle_line_slash_comments Then Else
          Begin
            add_result_line;
            read_line;
          End;
    End; // while l_the_line

    If l_the_line<> ''
      Then add_result_line;
  End; // remove_marked_up_text
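
The helper bodies are elided in the listing above, so the following is only a sketch of how such a loop can be made to run, not the code of the downloadable unit. The bodies of read_line and add_result_line, the l_at_end flag and the two simplified rule functions are all assumptions. The point is the "If rule Then Else" chain: each rule either consumes the current line and returns True, or leaves it for the next rule.

```pascal
program filter_loop_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses SysUtils, Classes;

procedure filter_lines(p_c_original, p_c_result: tStrings);
  var l_list_index: Integer;
      l_the_line: String;
      l_at_end: Boolean;

  // -- assumed behaviour: fetch the next line, or flag the end of the list
  procedure read_line;
    begin
      l_at_end:= l_list_index>= p_c_original.Count;
      if l_at_end
        then l_the_line:= ''
        else begin
            l_the_line:= p_c_original[l_list_index];
            Inc(l_list_index);
          end;
    end;

  procedure add_result_line;
    begin
      p_c_result.Add(l_the_line);
    end;

  // -- simplified rule: a line ending with "//-" is dropped
  function f_remove_marked_row: Boolean;
    var l_trimmed: String;
    begin
      l_trimmed:= TrimRight(l_the_line);
      Result:= (Length(l_trimmed)>= 3)
          and (Copy(l_trimmed, Length(l_trimmed)- 2, 3)= '//-');
      if Result
        then read_line;
    end;

  // -- simplified rule: a start of line "//" is dropped, "// --- " is kept
  function f_remove_start_comment: Boolean;
    var l_trimmed: String;
    begin
      l_trimmed:= TrimLeft(l_the_line);
      Result:= (Copy(l_trimmed, 1, 2)= '//')
          and (Copy(l_trimmed, 1, 7)<> '// --- ');
      if Result
        then read_line;
    end;

  begin // filter_lines
    p_c_result.Clear;
    l_list_index:= 0;
    read_line;

    while not l_at_end do
    begin
      if f_remove_marked_row then else
      if f_remove_start_comment then else
        begin
          add_result_line;
          read_line;
        end;
    end;
  end; // filter_lines

var g_c_original, g_c_result: tStringList;
    g_index: Integer;
begin
  g_c_original:= tStringList.Create;
  g_c_result:= tStringList.Create;
  try
    g_c_original.Add('EEE');
    g_c_original.Add('// --- FFF');
    g_c_original.Add('// ggg');
    g_c_original.Add('aaa //-');
    g_c_original.Add('HHH');

    filter_lines(g_c_original, g_c_result);

    // -- "// ggg" and "aaa //-" are dropped
    for g_index:= 0 to g_c_result.Count- 1 do
      WriteLn(g_c_result[g_index]);
  finally
    g_c_result.Free;
    g_c_original.Free;
  end;
end.
```

With this shape, adding a rule means writing one more nested Function and inserting one more line in the chain, at the position where its test is safe to run.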

2.2.3 - Example of start of line marker

We handle the "//" at the start of the line in the following way:

Function f_remove_start_of_line_slash_comments: Boolean;
// -- True if has a "__//xxxxxx" line

  Procedure erase_slash_greater_bloc;
      // -- a "__//>  ... __//< bloc
    Begin
      Repeat
        read_line;
      Until f_end_of_file Or (Copy(l_the_line, l_index, 3)= '//<');
      read_line;
    End; // erase_slash_greater_bloc

  Begin // f_remove_start_of_line_slash_comments
    If Copy(l_the_line, l_index, 2)= '//'
      Then Begin
          Result:= True;
          // -- keep if "// --- "
          If Copy(l_the_line, l_index+ 2, 5)= ' --- '
            Then Begin
                add_result_line;
                read_line;
              End
            Else
              If Copy(l_the_line, l_index, 3)= '//>'
                Then erase_slash_greater_bloc
                Else read_line;
        End
      Else Result:= False;
  End; // f_remove_start_of_line_slash_comments

2.2.4 - Example of middle of line marker

We remove the middle "//" with code like this:

Function f_remove_ds_middle_line_slash_comments: Boolean;
    // -- "xxx // yyy"
    // -- CAUTION: no // within a string
  Var l_slash_slash_position: Integer;
  Begin
    l_slash_slash_position:= Pos('//', l_the_line);
    If (l_slash_slash_position> 1)
        And Not
          (
              (Pos('end; // ', LowerCase(l_the_line))> 0)
            Or
              (Pos('begin // ', LowerCase(l_the_line))> 0)
           )
      Then Begin
          Result:= True;
          Delete(l_the_line, l_slash_slash_position,
              Length(l_the_line)+ 1- l_slash_slash_position);
          add_result_line;
          read_line;
        End
      Else Result:= False;
  End; // f_remove_ds_middle_line_slash_comments

Please note that

this code works only because the start of line "//" comments and the ending "//-", "//>" and "//<" markers have already been handled (see the main loop). Our code therefore depends on the calling order of the filtering functions
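
The listing above carries the caution "no // within a string". One possible way to lift that restriction, which is not part of the article's unit, is to replace the Pos lookup with a scan that ignores quoted text. The sketch below uses a hypothetical function f_comment_position and assumes Pascal string literals on a single line, where a doubled quote inside a literal simply toggles the state twice.

```pascal
program string_aware_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses SysUtils;

// -- position of the first "//" outside a string literal, 0 if none
function f_comment_position(const p_line: String): Integer;
  var l_index: Integer;
      l_in_string: Boolean;
  begin
    Result:= 0;
    l_in_string:= False;
    for l_index:= 1 to Length(p_line)- 1 do
      if p_line[l_index]= ''''
        then l_in_string:= not l_in_string
        else
          if not l_in_string
              and (p_line[l_index]= '/') and (p_line[l_index+ 1]= '/')
            then begin
                Result:= l_index;
                Break;
              end;
  end;

begin
  WriteLn(f_comment_position('x:= 5; // note'));
  WriteLn(f_comment_position('s:= ''http://site'';'));
  WriteLn(f_comment_position('s:= ''http://site''; // note'));
end.
```

The sketch still ignores "//" sequences that sit inside "{ }" or "(* *)" comments, so it narrows the caution rather than removing it.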

2.2.5 - The main Form

The main form is quite standard. Here is a snapshot of this form:

[Figure: snapshot of the cooking_the_code main form]

where

the help is a memo saved in a .txt file (can be filled by the user)
the favorites are loaded from a .txt file (same function as a filter combo box, but with a fixed format instead of a drop down)
the source code to filter is defined by its path and file name
the destination is defined by its path (with same file name), and the purple edit can be used to create the sub folder
selecting a source .PAS file
- loads the text in the "original_" memo (where it still can be modified and saved to disc)
- computes the filtered result
- this result is presented in the "result_" memo (where we still can modify and save it)

3 - Unit Test to Test the Code Cooker

To test that our filtering routines correctly remove the marked up code, we wrote a unit test Class.

The test Class definition is

Type c_remove_marker_test=
         Class(c_test_case)
           Private
             m_c_remove_sncf: c_remove_marked_up_text;
           Protected
             Procedure Setup; Override;
             Procedure Teardown; Override;
           Published
             // -- all the tests
             Procedure test_slash_comment;
             Procedure test_slash_comment_enclosing;
             Procedure test_slash_eol_comment_enclosing;
             Procedure test_slash_keep_comment;
             Procedure test_slash_remove_comment;
             Procedure test_remove_eol_slash_comment;
             Procedure test_remove_parenthesis_star_comment;
             Procedure test_keep_parenthesis_star_comment;
             Procedure test_remove_brace_comment;
             Procedure test_keep_brace_comment;
         End; // c_remove_marker_test

and, as an example, here is the test of the enclosing "//>" and "//<" markers:

Procedure c_remove_marker_test.test_slash_comment_enclosing;
  Begin
    With m_c_remove_sncf, m_c_original_list Do
    Begin
      Add('    CCC');
      Add('    //>');
      Add('    bbb');
      Add('    ccc');
      Add('    //<');
      Add('    DDD');

      remove_marked_up_text;

      Check(m_c_result_list.Strings[0]= m_c_original_list[0],
          '"enclosing //> //<", removed line 0') ;
      Check(m_c_result_list.Strings[1]= m_c_original_list[5],
          '"enclosing //> //<", removed line 5') ;

      Check(m_c_result_list.Count= Count- 4+ 1,
          '"enclosing //> //<", count') ;
    End; // with m_c_remove_sncf
  End; // test_slash_comment_enclosing

The result of the test looks like:

[Figure: snapshot of the cooking_the_code_test results]

4 - Comments and Improvements

This tool suits our needs, because it neatly fits our coding conventions.

However, it can easily be improved. You might

change the markers (for instance, use "//%" instead of "//-")
use more conspicuous markers. We decided to make them as discreet as possible, since we do not want to notice them while developing the code
change the semantics (what you want to filter out or keep)
add additional filtering rules (or remove some of our own)

It could also be useful to add some checking procedures (to check that the "//>" and "//<" counts match, etc.)
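
Such a check is not part of the tool presented here. As a rough sketch of what it could look like, the hypothetical function f_first_unbalanced_line below returns the index of the first line where the "//>" and "//<" markers stop pairing up, or -1 when they all match. It assumes that markers sit at the end of a line and that groups are never nested.

```pascal
program marker_check_sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses SysUtils, Classes;

// -- True if the line, ignoring trailing blanks, ends with p_marker
function f_ends_with(const p_line, p_marker: String): Boolean;
  var l_trimmed: String;
  begin
    l_trimmed:= TrimRight(p_line);
    Result:= (Length(l_trimmed)>= Length(p_marker))
        and (Copy(l_trimmed, Length(l_trimmed)- Length(p_marker)+ 1,
            Length(p_marker))= p_marker);
  end;

// -- 0-based index of the first offending line, -1 if markers are balanced
function f_first_unbalanced_line(p_c_lines: tStrings): Integer;
  var l_index, l_open_index: Integer;
  begin
    Result:= -1;
    // -- index of the pending "//>", -1 when no group is open
    l_open_index:= -1;
    for l_index:= 0 to p_c_lines.Count- 1 do
      if f_ends_with(p_c_lines[l_index], '//>')
        then begin
            // -- a second "//>" before the matching "//<"
            if l_open_index>= 0
              then begin
                  Result:= l_index;
                  Exit;
                end;
            l_open_index:= l_index;
          end
        else
          if f_ends_with(p_c_lines[l_index], '//<')
            then begin
                // -- a "//<" without a pending "//>"
                if l_open_index< 0
                  then begin
                      Result:= l_index;
                      Exit;
                    end;
                l_open_index:= -1;
              end;
    // -- a "//>" that was never closed
    if l_open_index>= 0
      then Result:= l_open_index;
  end;

var g_c_lines: tStringList;
begin
  g_c_lines:= tStringList.Create;
  try
    g_c_lines.Add('CCC');
    g_c_lines.Add('//>');
    g_c_lines.Add('bbb');
    g_c_lines.Add('DDD');
    // -- the "//>" on line 1 is never closed
    WriteLn(f_first_unbalanced_line(g_c_lines));

    g_c_lines.Add('//<');
    // -- now balanced
    WriteLn(f_first_unbalanced_line(g_c_lines));
  finally
    g_c_lines.Free;
  end;
end.
```

Running such a check before the filter would turn a silently truncated file into an explicit error with a line number.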

The filter is similar to weaving (we consider two aspects of the same code), but our process is unidirectional (there is no easy way to integrate customer changes in the original source).

There are some obvious drawbacks:

we are only removing areas. It would be quite tedious to remove variables or some procedure parameters
there is no check as to the consistency of the filtering (we could remove a class method definition, and not remove its implementation, or remove a Uses name, only to find out after compilation that some imported information is still used in the filtered code)
the technique is invasive (the original code is modified by inserting markers to perform the filtering)

In our working tool we also added the removal of procedure declarations, implementations and calls. We hand over a list of procedure names, and the filter removes them everywhere. The nice thing is that this technique is not invasive. It requires, however, a lexical analyzer, which the filter presented in this article does not need.

Based on the success of this article, on time, and on popular demand ( :) ) we could (well, cook up, and then) publish this procedure filtering tool, with its scanner, parser and unit test code, in a companion article on this site.

As a last note, we used the term "cooking the code" as a nod to the very funny "cooking the books" expression.

5 - Download the Source Code

Here are the source code files:

cooking_the_code.zip: the main project and the filtering unit (37 K)
cooking_the_code_test.zip: the unit test project and the filtering unit (35 K)

The .ZIP file(s) contain:

the main program (.DPR, .DOF, .RES), the main form (.PAS, .DFM), and any other auxiliary form
any .TXT for parameters, samples, test data
all the units (.PAS)

Those .ZIP

are self-contained: you will not need any other product (unless expressly mentioned).
for Delphi 6 projects, can be used from any folder (the paths are RELATIVE)
will not modify your PC in any way beyond the path where you placed the .ZIP (no registry changes, no path creation etc).

To use the .ZIP:

create or select any folder of your choice
unzip the downloaded file
using Delphi, compile and execute

To remove the .ZIP simply delete the folder.

The Pascal code uses the Alsacian notation, which prefixes identifiers by program area: K_onstant, T_ype, G_lobal, L_ocal, P_arametre, F_unction, C_lass etc. This notation is presented in the Alsacian Notation paper.

As usual:

please tell us at fcolibri@felix-colibri.com if you find errors, mistakes, bugs or broken links, or if you have a problem downloading the files. The resulting corrections will help other readers
we welcome any comment, criticism, enhancement, other sources or reference suggestion. Just send an e-mail to fcolibri@felix-colibri.com.
and if you liked this article, talk about this site to your fellow developers, add a link to your links page or mention our articles in your blog or newsgroup posts when relevant. That's the way we operate: the more traffic and Google references we get, the more articles we will write.

6 - References

The Unit Test project, with its source code, is presented in the Unit Test Framework article

7 - The author

Felix John COLIBRI works at the Pascal Institute. Starting with Pascal in 1979, he then became involved with Object Oriented Programming, Delphi, Sql, Tcp/Ip, Html, UML. Currently, he is mainly active in the area of custom software development (new projects, maintenance, audits, BDE migration, Delphi Xe_n migrations, refactoring), Delphi Consulting and Delphi training. His web site features tutorials, technical papers about programming with full downloadable source code, and the description and calendar of forthcoming Delphi, FireBird, Tcp/IP, Web Services, OOP / UML, Design Patterns, Unit Testing training sessions.

Created: jun-09. Last updated: jul-15 - 98 articles, 131 .ZIP sources, 1012 figures
Copyright © Felix J. Colibri http://www.felix-colibri.com 2004 - 2015. All rights reserved